Plagiarism Detection: Keeping Check on Misuse of Intellectual Property

Authors: Joshi Nisheeth; Mathur Iti
Publication date: 01/11/2011

Plagiarism has become a menace. Every journal editor and conference organizer has to deal with this problem. Copying or rephrasing text without giving due credit to the original author has become increasingly common and is considered intellectual property theft. We are developing a plagiarism detection tool to deal with this problem. In this paper we discuss the common tools available to detect plagiarism, their shortcomings, and the advantages of our tool over them.

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Design of English-Hindi Translation Memory for Efficient Translation

Authors: Joshi Nisheeth; Mathur Iti
Publication date: 19/10/2012

Developing parallel corpora is an important but difficult activity for machine translation, since it requires manual annotation by human translators. Translating the same text again is wasted effort. Tools that address this are available for European languages, but no such tool is available for Indian languages. In this paper we present a tool for Indian languages that not only retrieves previously stored translations automatically but also, where a sentence has multiple translations, presents them as a ranked list of suggested translations. The tool also gives translators global and local saving options for their work, so that they can share it with others, which further lightens the task.
Comment: Proceedings of National Conference in Recent Advances in Computer Engineering, 201

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive
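
The ranked-suggestion idea in the translation memory abstract above can be sketched as follows. This is a minimal illustration, not the authors' tool: the `MEMORY` contents, the `suggest` function, the 0.7 threshold, and the use of `difflib` character similarity for ranking are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory translation memory: each English source segment
# maps to one or more previously stored Hindi translations.
MEMORY = {
    "How are you?": ["आप कैसे हैं?", "तुम कैसे हो?"],
    "Thank you": ["धन्यवाद"],
}

def suggest(query, memory, threshold=0.7):
    """Return (score, source, target) tuples, best match first.

    The score is a character-level similarity ratio in [0, 1] between the
    query and a stored source segment (an assumed ranking criterion).
    """
    hits = []
    for source, targets in memory.items():
        score = SequenceMatcher(None, query.lower(), source.lower()).ratio()
        if score >= threshold:
            # A segment with several stored translations yields several hits.
            for target in targets:
                hits.append((score, source, target))
    # Python's sort is stable, so equal scores keep their stored order.
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits

for score, source, target in suggest("How are you", MEMORY):
    print(f"{score:.2f}  {source}  ->  {target}")
```

A query that closely matches a stored segment returns every stored translation of that segment, so the translator can pick one instead of translating the sentence again.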

Input Scheme for Hindi Using Phonetic Mapping

Authors: Joshi Nisheeth; Mathur Iti
Publication date: 01/03/2010

Written communication on computers requires knowing how to enter text in the desired language. Most people do not use any language other than English on a computer, which creates a barrier. To resolve this issue we have developed a scheme for entering Hindi text using phonetic mapping. Using this scheme we generate intermediate code strings and match them with the pronunciations of the input text. Our system shows significant success over other available input systems.

CogPrints Cognitive Sciences Eprint Archive
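
To give a feel for phonetic input of Hindi, here is a rough sketch of a rule-based Roman-to-Devanagari mapper. It is not the scheme from the paper above, which matches intermediate code strings against pronunciations; the mapping tables, the function names `tokenize` and `to_devanagari`, and the conjunct rule are simplifying assumptions that cover only a handful of letters.

```python
# Illustrative Roman-to-Devanagari tables (a small, assumed subset).
CONSONANTS = {"kh": "ख", "k": "क", "g": "ग", "t": "त", "d": "द", "n": "न",
              "p": "प", "b": "ब", "m": "म", "r": "र", "l": "ल", "s": "स",
              "h": "ह"}
# Each vowel maps to (independent form, dependent sign used after a consonant).
VOWELS = {"aa": ("आ", "ा"), "a": ("अ", ""), "ee": ("ई", "ी"), "i": ("इ", "ि"),
          "oo": ("ऊ", "ू"), "u": ("उ", "ु"), "e": ("ए", "े"), "o": ("ओ", "ो")}
VIRAMA = "्"  # joins two consonants into a conjunct

def tokenize(word):
    """Split a romanized word into table keys, trying two-letter keys first."""
    tokens, i = [], 0
    while i < len(word):
        for size in (2, 1):
            chunk = word[i:i + size]
            if chunk in CONSONANTS or chunk in VOWELS:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"no mapping for {word[i]!r}")
    return tokens

def to_devanagari(word):
    tokens = tokenize(word.lower())
    out = []
    for idx, tok in enumerate(tokens):
        if tok in CONSONANTS:
            out.append(CONSONANTS[tok])
            nxt = tokens[idx + 1] if idx + 1 < len(tokens) else None
            if nxt in CONSONANTS:
                out.append(VIRAMA)  # consonant cluster: suppress inherent vowel
        else:
            independent, sign = VOWELS[tok]
            after_consonant = idx > 0 and tokens[idx - 1] in CONSONANTS
            out.append(sign if after_consonant else independent)
    return "".join(out)

print(to_devanagari("namaste"))
print(to_devanagari("kamal"))
```

A word-final consonant is left without a virama here, reflecting the usual Hindi spelling of words such as "kamal"; a fuller scheme would need many more letters and disambiguation rules.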

Evaluation of Computational Grammar Formalisms for Indian Languages

Authors: Joshi Nisheeth; Mathur Iti
Publication date: 01/11/2010

Natural language parsing has been the most prominent research area since the genesis of natural language processing. Probabilistic parsers are being developed to make parser development easier, more accurate, and faster. In the Indian context, which computational grammar formalism to use is still an open question. In this paper we focus on this problem and analyze different formalisms for Indian languages.

CogPrints Cognitive Sciences Eprint Archive

OntoAna: Domain Ontology for Human Anatomy

Authors: Joshi Nisheeth; Mathur Iti; Vashisth Archana
Publication date: 01/01/2012

Today there are many search engines, but the information they provide is mostly operational in nature. None of them provides domain-specific information, which is troublesome for a novice user who wants information in a particular domain. In this paper we develop an ontology that can be used by a domain-specific search engine. The ontology covers human anatomy and captures information about the cardiovascular system, digestive system, skeleton, and nervous system. This information can be used by people working in the medical and health care domain.
Comment: Proceedings of 5th CSI National Conference on Education and Research. Organized by Lingayay University, Faridabad. Sponsored by Computer Society of India and IEEE Delhi Chapter. Proceedings published by Lingayay University Pres

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive

Improving the quality of Gujarati-Hindi Machine Translation through part-of-speech tagging and stemmer-assisted transliteration

Authors: Ameta Juhi; Joshi Nisheeth; Mathur Iti
Publication date: 01/06/2013

Machine translation for Indian languages is an emerging research area. Transliteration is one of the modules designed when building a translation system. Transliteration means mapping source-language text into the script of the target language. Simple mapping decreases the efficiency of the overall translation system. We propose the use of stemming and part-of-speech tagging for transliteration; the effectiveness of translation can be improved with transliteration assisted by part-of-speech tagging and stemming. We show that much of the content in Gujarati gets transliterated while being processed for translation into Hindi.

CogPrints Cognitive Sciences Eprint Archive

Development of a Hindi Lemmatizer

Authors: Joshi Nisheeth; Mathur Iti; Paul Snigdha
Publication venue: Arohan Publishers
Publication date: 01/05/2013

We live in a translingual society: to communicate with people from different parts of the world, we need expertise in their respective languages. Learning all these languages is not possible, so we need a mechanism that can do this task for us. Machine translators have emerged as a tool that can perform it. Developing a machine translator requires developing several different rules. The first module in the machine translation pipeline is morphological analysis, which includes stemming and lemmatization. In this paper we create a lemmatizer that generates rules for removing affixes, together with additional rules for creating a proper root word.
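
The two kinds of rules mentioned in the lemmatizer abstract above (strip an affix, then add characters to form a proper root) can be sketched as below. The rule list, the `lemmatize` function, and the minimum stem length are hypothetical choices for illustration only; they are not the rules from the paper and will over-apply on many real words.

```python
# Hypothetical (suffix to strip, characters to add back) rules for Hindi nouns.
RULES = [
    ("ियाँ", "ी"),  # कहानियाँ -> कहानी
    ("ें", ""),     # किताबें -> किताब
    ("े", "ा"),     # कमरे -> कमरा
]
# Try longer suffixes first so a short suffix never shadows a longer one.
RULES.sort(key=lambda rule: len(rule[0]), reverse=True)

MIN_STEM_LENGTH = 2  # assumed guard against stripping very short words

def lemmatize(word):
    """Apply the first matching rule; return the word unchanged otherwise."""
    for suffix, addition in RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if len(stem) >= MIN_STEM_LENGTH:
                return stem + addition
    return word

for form in ["कहानियाँ", "किताबें", "कमरे", "घर"]:
    print(form, "->", lemmatize(form))
```

Unlike a plain stemmer, which would stop at the stripped stem, the add-back step is what restores a dictionary form such as कमरा.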

A Lightweight Stemmer for Gujarati

Authors: Ameta Juhi; Joshi Nisheeth; Mathur Iti
Publication date: 01/12/2011

Gujarati is a resource-poor language with almost no language processing tools available. In this paper we present an implementation of a rule-based stemmer for Gujarati. We describe the creation of the stemming rules and the morphological richness of Gujarati. We have also evaluated our results by having them verified by a human expert.

arXiv.org e-Print Archive

CogPrints Cognitive Sciences Eprint Archive